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Abstract 



We consider an exponentially fitted discontinuous Galerkin method and propose 
a robust block solver for the resulting linear systems. 



< 

1 Introduction 

Let Q C IR 2 be a convex polygon, / 6 L 2 (il),g € H x / 2 {dQ) and let e > be constant. 
We consider the advection-diffusion problem 

,_i — div(eVu — (3u) = f in Q, u = g on dQ, (1.1) 

m 

where (3 G W 1,oc (£l) derives from a potential (3 = V?/>. In applications to semiconductor 
devices, u represents the concentration of positive charges, ip the electrostatic potential 



and the electric field |V?/>| might be fairly large in some parts of f2, so that (1.1 ) becomes 
advection dominated. Its robust numerical approximation and the design of efficient 
solvers, are still nowdays a challenge. Exponential fitting |2] and discontinuous Galerkin 
. £h (DG) are two different approaches that have proved their usefulness for the approximation 

of (l.lj). Both methodologies have been combined in [3] to develop a new family of 



exponentially fitted DG methods (in primal and mixed formulation). In this note, we 
consider a variant of these schemes, based on the use of the Incomplete Interior Penalty 
IIPG-0 method and propose also an efficient block solver for the resulting linear systems. 
By introducing the change of variable 

p:= e -^u (1.2) 
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problem (1.1) can be rewritten as the following second order problem 

- V ■ (/?Vp) = / in fi, p = X ondtt, (1.3) 



if) ip 



where k := ee<= and x := e £ #• An IIPG-0 approximation to (1.3) combined with a 



suitable local approximation to (1.2), gives rise to the EF-IIPG-0 scheme for (1.1). We 
propose a block solver that uses ideas from pQ to reduce the cost to that of a Crouziex- 
Raviart (CR) (exponentially fitted) discretization. By using Tarjan's algorithm, the 
associated matrix is further reduced to block lower triangular form, and a block Gauss- 
Siedel algorithm results in an exact solver. 

To give a neat presentation, we focus on the case (3 = Vip piecewise constant; if) 
piecewise linear continuous, although we include some numerical results for a much more 
general case (cf. Test 2). Due to space restrictions, we describe the method and the 
solver and show some numerical results; further extensions of the method (allowing if) to 
be discontinuous) and the convergence analysis of the proposed solvers will be consider 
somewhere else. 

2 The Exponentially Fitted IIPG-0 method 

Let Th be a shape-regular family of partitions of Q into triangles T and let h = max^er^ hr 
with Ht denoting the diameter of T for each T £ Th- We assume Th does not contain 
hanging nodes. We denote by £^ and £® the sets of all interior and boundary edges, 
respectively, and we set Eh — £% U £®. 

Average and jump trace operators: Let T + and T~ be two neighboring elements, 
and n + , n be their outward normal unit vectors, respectively (n^ 1 = n<p±). Let (^ and 
t be the restriction of C, and r to T ± . We set: 

2{C} = (C + + C), [CI = C + n + + C~n- on E e £° h , 

2{t} = (t+ + t-), [t] =r + -n+ + T--n- on E 6 £° h , 

and on e G £® we set [C] = C n an d { T } = T - We will also use the notation 



(u,w)r h = 2_. I uw dx ( u i w )£ h — y, / uwds Vu,w,<E V 1 



where V DG is the discontinuous linear finite element space defined by: 

V DG = {ue L 2 (tt) : u w e P X (T) VTeTh}, 

F l (T) being the space of linear polynomials on T. Similarly, P°(T) and P°(e) are the 
spaces of constant polynomials on T and e, respectively. For each e£4 (resp. for each 
T eT h ), let V° e : L 2 (e) — ► P°(e) (resp. V^ : L 2 {T) — ► P°(T)) be the L 2 -orthogonal 
projection defined by 

V° e (u) := 1 / 'u, VueL 2 (e), V Q T {v) := i- / v, VveL 2 (T). 

\ e \ Je I 1 I JT 



We denote by V CR the classical Crouziex-Raviart (CR) space: 

V CR ={v G L 2 (Vt) : v\ T G P X (T) VT G 7^ and P°[v] =0VeG^}. 

Note that t> = at the midpoint m e of each e G £f . To represent the functions in V DG 
we use the basis {f e ,T}TeT h ,e&e h , defined by 

VTeT h y e , T {x) G F\T) e C dT ^,rK') = 4/ Ve' G £ h . (2.1) 

In particular, any w G P*(T) can be written as w = Y^ecdT w ( m e)fe,T- 

The Exponentially fitted IIPG-0 method We first consider the IIPG-0 approxima- 



tion to the solution of (1.3): Find p G V DG such that A(p, w) = (f,w)j- h forall w G V DG 
with 

A{p,w) = (K* T Vp,Vw) Th - ({/4Vp}, {w]) Sh + (S e {lpj},P°(lwj)) £h . (2.2) 

Here, S" e is the penalty parameter and n^ G P°(T) the harmonic average approximation 
to K = ee^/ e both defined by [3]: 

1 e 



/i'/ 



n^" 1 ) " P°(e^) 



S e :=a e h e x {K* T } e , (2.3) 



Next, following [3] we introduce the local operator T : V DG — > V that approximates 
the change of variable ( |1.2[ ) : 

T W := Y, ^ w )\t =EE ^(^VKW V^ G ^ G . (2.4) 

TeT h TeT h eCdT 



By setting p := Tu in ( |2.2[ ), we finally get the EF-IIPG-0 approximation to (1.1): 
Find u h G V DG s.t. B(u h ,w) := A(1u h ,w) = (f,w) Th Vw G V DG with 

iSK^^^VT^V^k-d^VlnjJ^D^+^ellTn]},^ ^])^. (2.5) 



It is important to emphasize that the use of harmonic average to approximate k = ee 



i,/e 



as defined in (2.3) together with the definition of the local approximation of the change 
of variables prevents possible overflows in the computations when if> is large and e is 
small. (See [3] for further discussion). Also, these two ingredients are essential to ensure 
that the resulting method has an automatic upwind mechanism built-in that allows for 
an accurate approximation of the solution of (1.1) in the advection dominated regime. 



We will discuss this in more detail in Section [3 

Prior to close this section, we define for each ee4 and T G Th' 

if) m , e := min ?p (x) ip m}T := minV>(ar); if) myT < Vw for e c ^T . 
In the advection dominated regime e ^C \(3\h = \Vip\h 

V° T (e- We) ) * 6 2 e"^ K(*-* ,€ ) ^^ . (2.6) 



The first of the above scalings together with the definitions in (2.3) implies 



-e~^~ , S e ~ ^ 6 e = dTl n dT2 ■ (2 ' 7) 



3 Algebraic System &; Properties 

Let A and B be the operators associated to the bilinear forms A(-,-) (2.2) and B(-, 



(|2.5|), respectively. We denote by A and B their matrix representation in the basis 
. In 
and 



We,T}TeT h ,eee h (2.1). In this basis, the operator T defined in (2.4) is represented as a 
diagonal matrix, 



i). Thus, the approximation to (1.3) and (1.1) amounts 



to solve the linear systems (of dimension 2n e — n^; with n e and rif, being the cardinality 
of Eh and E® , respectively): 



Ap = F , and Bu = p or Bu = F , (3.1) 

where p, u, F and F are the vector representations of p, u and the rhs of the approximate 



problems. From the definition (|2.4|) of X it is easy to deduce the scaling of the entries of 
the diagonal matrix I 



(A \2n e -n 6 



(A. \2ne-n b j 



di,j = i^j. 



We now revise a result from [TJ : 

Proposition 3.1 Let Z C 1^ DG 6e t/ie space defined by 

Z={ze L 2 (VL) : *| T e P : (T) VT G % and V°{v} = 0VeG^}. 

Then, for any w G V DG there exists a unique w cr G V CR and a unique w z G Z such that 
w = w cr + w z , that is: V DG = V CR © Z. Moreover, A(w cr ,w z ) = Vw cr G V CR } and 

\/w z ez. 



Proposition 3.1 provides a simple change of basis from {<p e ,T} to canonical basis in V 



CR 



and Z that results in the following algebraic structure for (3.1): 



A 



A zz 

A vz A vv 



(3.2) 



Due to the assumed continuity of ip, D is still diagonal in this basis. The algebraic 



structure (3.2) suggests the following exact solver: 



Algorithm 3.2 Let u be a given initial guess. For k > 0, and given u^ = z^ + v^, the 
next iterate Uk+i = Zk+i + Vk+i is defined via the two steps: 

1. Solve B(u z k+1 ,w z ) = (f,w z ) Th \/w z eZ. 

2. Solve B(u c k r +1 , w cr ) = (/, w cr ) Th - B(u z k+1 , w cr ) \/w cr G V CR . 
Next, wet discuss how to solve efficiently each of the above steps: 



Step 2: Solution in V . In pQ it was shown that the block A vv coincides with the 



stiffness matrix of a CR discretization of (1.3), and so it is an s.p.d. matrix. However, 



this is no longer true for M vv which is positive definite but non-symmetric. 

B(u cr , w cr ) = {K* T VZu cr , Vw cr ) Th V u cr , w cr e V CR . 



In principle, the sparsity pattern of M vv is that of a symmetric matrix. Using (2.6) and 



(2.3), we find that the entries of the matrix scale as: 



Since ip is assumed to be piecewise linear, for each T, it attains its minimum (and also 
its maximum) at a vertex of T, say x Q and il) m ,e is attained at one of the vertex of the 
edge e, say x e . In particular, this implies that 

( x e = x 

1pm,e ~ 1pm,T ~ V> • {x e - X ) = (3 • (x e ~ X ) 

\j3\h X e ^X 



Hence, in the advection dominated case e <^C \/3\h some of the entries in (3.3) vanish (up 
to machine precision) for e small; this is the automatic upwind mechanism intrinsic of 
the method. As a consequence, the sparsity pattern of ~B VV is no longer symmetric and 
this can be exploited to re-order the unknowns so that ~B VV can be reduced to block lower 
triangular form. 
Notice also that for Th acute, the block A vv being the stiffness matrix of the Crouziex- 



Raviart approximation to (1.3), is an M-matrix. Hence, since the block M vv is the product 
of a positive diagonal matrix and A vv , it will also be an M-matrix if the triangulation is 
acute (see [2]). 

Step 1: Solution in the Z-space. In [1J it was shown that A zz is a diagonal p.d. 
matrix. This is also true for B 2 ^ since it is the product of two diagonal matrices. The 
continuity of ip implies 

B(u z ,w z ) = (S e nu z lV° e ({w z l)) £h \/u z ,w z eZ. (3.4) 

Using (2.6) and (|2.3) we observe that the entries of M zz scale as: 



which are always positive, so in particular M zz it is also an M-matrix. 

4 Block Gauss-Siedel solver for F Ci? -block 

We now consider re-orderings of the unknowns (dofs), which reduce M vv to block lower 
triangular form. For such reduction, we use the algorithm from [1] which roughly amounts 



to partitioning the set of dofs into non-overlapping blocks. In the strongly advection 
dominated case the size of the resulting blocks is small and a block Gauss-Seidel method 
is an efficient solver. Such techniques have been studied in [5J for conforming methods. 
The idea is to consider the directed graph G = (V,E) associated with W v G IR" crX " ,cr ; 
G has n cr vertices labeled V = {1, . . . , n cr } and its set of edges edges E has cardinality 
equal to the number of nonzero entries^ of M vv . By definition, (i,j) G E iff b^ 7^ 0. 
Note that in the advection dominated case, due to the nonsymmetric pattern of M vv 
(caused by the built-in upwind mechanism), we may have (i,j) G E, while (j,i) <£ E. 
Then, the problem of reducing M vv to block lower triangular form of M vv is equivalent to 
partitioning G as a union of strongly connected components. Such partitioning induces 
non-overlapping partitioning of the set of dofs, V = U^Wj. For i = 1, . . . , JV&, let mi 
denote the cardinality of w»; let Ij G ]R™ crXmi be the matrix that is identity on dofs in Wj 
and zero otherwise; and M™ = lf~B vv Ii is the block corresponding to the dofs in Wj. The 
block Gauss-Seidel algorithm reads: Let u^f be given, and assume u c ^ has been obtained. 
Then u^\ 1 is computed via: For % = 1, . . . Nf, 



u 



k+i/N b 



u fc+(;-i)/7V 6 + 1; 



avv\-ljrT 



IJ(F- 



nvv cr 

T> "" fc+(i _ 1)/iV() 



)• 



(4.1) 



As we report in Section M in the advection dominated regime the action of 
be computed exactly since the size of the blocks M™ is small. 



omA— 1 



can 



5 Numerical Results 



Test 1: 50 blocks 156 Dofs 



Test 2: 1 22 blocks 31 6 Dofs 
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(a) Test 1 with e = 1(T 5 
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(b) Test 2 with e = 1CT 7 



Figure 5.1: Plot of the connected components (blocks) of ~B VV created during Tarjan's 
algorithm. 



We present a set of numerical experiments to assess the performance of the proposed 



block solver. The tests refer to problem (1.3) with e = 10 3 , 10 5 , 10 7 , and Q is trian- 
gulated with a family of unstructured triangulations Th- In the tables given below J = 1 



^ach dof corresponds to a vertex in the graph; each nonzero entry to an edge. 



corresponds to the coarsest grid and each refined triangulation on level J, J = 2,3,4 is 
obtained by subdividing each of the T e Th on level (J— 1) into four congruent triangles. 
From the number of triangles nx the total number of dofs for the DG approximation is 
3ut and n e — n& for the CR part of the solution. 

Test 1. Boundary Layer: fi = (— 1, l) 2 , (3 = [1,1]*, rvr = 112 for the coarsest mesh 
and / is such that the exact solution is given by 

/ l + e -2/6_ 2e (*-l)/e\ / ! + e -2/6 _ 2e (v-l)/« 

u(a;, y) = Is + = = 7 - I I 3/ + 



1 - e" 2 A 7 V 1 - e_2A 

Test 2. Rotating Flow: Q = (-1, l) x (0, 1), / = and curl/3 ^ 0, 

2y(l - x 



P 



-1 1 

„2\ 



-2x(l - y 



2'i 



1 + tanh (10(2rr + 1)) a; < 0, y = 0, 
9{x,y) 

elsewhere . 



We stress that this test does not fit in the simple description given here, and special care 
is required (see [3]). For the approximation, for each T e 7a, with barycenter (xt,Ut), 
we use the approximation /3| T « V"0|t with -0| T = 2y T (l — x^)x — 2x^(1 — 2y^)y (and 
so V> discontinuous). The coarsest grid has tit = 224 triangles. 

In Figure [5TT] the plot of the connected components of the graph depicting the blocks 
for M vv created during Tarjan's algorithm, on the coarsest meshes is shown; for Test 1 



with e = 10~ 5 and for Test 2 with e = 10~ 7 . In Tables 5.1 are given, the number of blocks 
Nb created during Tarjan's algorithm. We also report in this table the size of the largest 
block created (Mb maximum size) and the average size of the blocks n av . Observe that 
in the advection dominated regime the largest block has a very small size compared to 
the total size of the system. After Tarjan's algorithm is used to re-order the matrix M vv , 



we use the block Gauss-Seidel algorithm (4.1) where each small block is solved exactly. 
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